Detecting apparatus of motion vector.

By extending the time-space differential method of motion vector detection, (1) repetitive
calculations become unnecessary, (2) different motion vectors near the boundary of an object can be
detected, and (3) coupling with a framework of region segmentation is enabled. For this purpose, a
horizontal direction differentiating filter (104), a vertical direction differentiating filter (105), and a
time direction differentiating filter (106) calculate, at every pixel, the time-space differentials
(∂I/∂x, ∂I/∂y, ∂I/∂t) in formula (2) that are necessary for motion vector estimation. A feature vector
combining part (110) generates a sample vector coupling the time-space differential values and the
position for a random image position (x, y). A maximum likelihood class determining part (112) and a
maximum likelihood class data changing part (113) then cluster the samples, pairing each sample with a
class so as to minimize the distance between them. The motion vector is obtained from the third
eigenvector of the covariance matrix of the time-space differential values.
Fig. 2

BACKGROUND OF THE INVENTION

1. Field of the Invention 

The present invention relates to a motion vector detecting apparatus for image coding, noise
reduction, object tracking, and the like.

2. Related art of the Invention 

A technique of motion vector detection based on the time-space differential method has already been
proposed, for example, as Horn's method (Horn, B.K.P. and B.G. Schunck: "Determining optical flow,"
Artificial Intelligence, Vol. 17, pp. 185-203, 1981). In this method, the brightness of point (x, y) on
the image at time t is denoted I(x, y, t), and an object is assumed to move by Δx along the x-axis and
by Δy along the y-axis during an infinitesimal time Δt. Assuming the brightness of the same point on
the object to be constant, formula (1) is established.

$$I(x, y, t) = I(x + \Delta x,\; y + \Delta y,\; t + \Delta t) \qquad (1)$$

By expanding the right side of formula (1) in a Taylor series, ignoring the higher order terms,
dividing by Δt, and taking the limit Δt → 0, formula (2) is established.

$$\frac{\partial I}{\partial x}\,u + \frac{\partial I}{\partial y}\,v + \frac{\partial I}{\partial t} = 0 \qquad (2)$$

where u = dx/dt and v = dy/dt, that is, the limits of Δx/Δt and Δy/Δt. Taking (u, v) as the velocity
vector, formula (2) is a constraint equation given by time-space differentiation. From the time-space
differentials of luminance, this constraint equation determines a straight line on which the motion
vector (u, v) lies, but a further constraint is needed to determine (u, v) itself. Horn, supposing that
(u, v) changes smoothly over the image, applied such an additional constraint and determined the motion
vector. Fig. 1 shows the structure of a conventional motion vector detecting apparatus of this kind. In
Fig. 1, numeral 301 is an A/D converter, 302 and 303 are frame memories, 304 is a horizontal direction
differentiating filter, 305 is a vertical direction differentiating filter, 306 is a time direction
differentiating filter, 307 is a horizontal direction differential image memory, 308 is a vertical
direction differential image memory, 309 is a time direction differential image memory, 310 is a
horizontal direction motion memory, 311 is a horizontal direction motion estimating part, 312 is a
vertical direction motion estimating part, 313 is a vertical direction motion memory, and 314 and 315
are smoothing filters.

The image entered through the A/D converter 301 is delayed by one frame time in the frame memory 302,
and then passes on to the frame memory 303.

The horizontal direction differentiating filter 304, the vertical direction differentiating filter 305,
and the time direction differentiating filter 306 calculate the time-space differentials
(∂I/∂x, ∂I/∂y, ∂I/∂t) of formula (2) necessary for estimation of the motion vector at each pixel.

The calculation results are recorded in the horizontal direction differential image memory 307, the
vertical direction differential image memory 308, and the time direction differential image memory 309,
respectively.

Since the image is sampled, the calculation of differentiation is replaced by calculation of
differences.
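As an illustration of how differentiation can be replaced by differences on a sampled image, the
following is a minimal Python sketch, not the filters of Fig. 1. The function name
`time_space_differences`, the central differences in space, and the backward difference in time are
assumptions made for the example; the text above does not fix a particular difference scheme.

```python
def time_space_differences(prev, curr):
    """Return (Ix, Iy, It) images for two equally sized frames (lists of rows).

    Assumed scheme: central differences in x and y on the current frame
    (one-sided at the borders) and a backward difference in time.
    Both frames must be at least 2 x 2 pixels.
    """
    h, w = len(curr), len(curr[0])
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    it = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Clamp neighbour indices so border pixels use one-sided differences.
            x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
            y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
            ix[y][x] = (curr[y][x1] - curr[y][x0]) / (x1 - x0)
            iy[y][x] = (curr[y1][x] - curr[y0][x]) / (y1 - y0)
            it[y][x] = curr[y][x] - prev[y][x]
    return ix, iy, it
```

For example, with frames prev[y][x] = 2x + 3y and curr[y][x] = 2(x - 1) + 3y (a luminance ramp shifted
one pixel to the right), the sketch gives (Ix, Iy, It) = (2, 3, -2) at every pixel, which satisfies
formula (2) with (u, v) = (1, 0).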

Horn, supposing that (u, v) changes smoothly over the image, determined the motion vector by
minimizing formula (3).

$$E = \iint \left\{ \alpha^2 \left[ (u - \bar{u})^2 + (v - \bar{v})^2 \right] + (u I_x + v I_y + I_t)^2 \right\} dx\,dy \qquad (3)$$

Hereinafter, for the sake of simplicity, (∂I/∂x, ∂I/∂y, ∂I/∂t) is written as (Ix, Iy, It). Formula (3)
can be minimized by the iteration shown in formula (4), which solves the Euler-Lagrange equations
obtained by partial differentiation of formula (3). Herein, $(\bar{u}, \bar{v})$ is the vicinity average
of the motion vector, k is the iteration count, and α is a constant.

$$u^{k+1} = \bar{u}^k - \frac{I_x \left( I_x \bar{u}^k + I_y \bar{v}^k + I_t \right)}{\alpha^2 + I_x^2 + I_y^2}$$

$$v^{k+1} = \bar{v}^k - \frac{I_y \left( I_x \bar{u}^k + I_y \bar{v}^k + I_t \right)}{\alpha^2 + I_x^2 + I_y^2} \qquad (4)$$
$(u^{k+1}, v^{k+1})$ in formula (4) is calculated in the horizontal direction motion estimating part 311
and the vertical direction motion estimating part 312, respectively, and is stored back in the
horizontal direction motion memory 310 and the vertical direction motion memory 313.

The vicinity average necessary for the calculation is computed by the smoothing filters 314, 315. Thus,
in this conventional motion vector detecting apparatus, the motion vector is determined by alternating
motion estimation and vicinity smoothing.
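The iteration of formula (4) can be sketched in Python as follows. This is a simplified illustration
rather than the apparatus of Fig. 1: the names `neighbour_mean` and `horn_iteration` are hypothetical,
and the vicinity average is assumed here to be a plain mean of the four nearest neighbours with border
replication, which the text does not specify.

```python
def neighbour_mean(field, y, x):
    """Mean of the 4 nearest neighbours, replicating values at the image border."""
    h, w = len(field), len(field[0])
    ys = (max(y - 1, 0), min(y + 1, h - 1), y, y)
    xs = (x, x, max(x - 1, 0), min(x + 1, w - 1))
    return sum(field[j][i] for j, i in zip(ys, xs)) / 4.0


def horn_iteration(u, v, ix, iy, it, alpha):
    """One pass of formula (4) over the whole image; returns the new (u, v)."""
    h, w = len(u), len(u[0])
    new_u = [[0.0] * w for _ in range(h)]
    new_v = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Vicinity averages (the role of smoothing filters 314, 315).
            ub, vb = neighbour_mean(u, y, x), neighbour_mean(v, y, x)
            gx, gy, gt = ix[y][x], iy[y][x], it[y][x]
            # Constraint residual scaled by the denominator of formula (4).
            r = (gx * ub + gy * vb + gt) / (alpha ** 2 + gx ** 2 + gy ** 2)
            new_u[y][x] = ub - gx * r
            new_v[y][x] = vb - gy * r
    return new_u, new_v
```

With uniform differentials (Ix, Iy, It) = (1, 0, -1), α = 1, and a zero initial field, each pass halves
the remaining error in u, so u approaches 1 only after many passes. This is the repeated calculation
that problem (1) below refers to.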

In the foregoing prior art, however, the following problems are present. 

(1) Several tens of iterations or more are generally needed, so the processing time is long.

(2) Smoothing is necessary for determining the vicinity average, and hence different motion vectors
near the boundary of an object are smoothed together and are not determined accurately.

(3) The estimation error is large on the boundary of the object.

The second and third problems arise because estimation of the motion vector and region segmentation of
the image depend on each other. Estimation of the motion vector therefore needs to be coupled with a
framework of region segmentation.

SUMMARY 

It is hence a primary object of the invention to solve the above problems, to enhance the precision of
motion vector detection on object boundaries, and to shorten the processing time.

A detecting apparatus of motion vector of the present invention comprises:
(a) a memory for holding video signals which are coded and composed of frame units,
(b) a horizontal differentiating filter for reading out luminance data from the memory and
differentiating (executing a differential method) or differencing (executing a difference method) in a
horizontal direction of arbitrary image coordinates,
(c) a vertical differentiating filter for reading out luminance data from the memory and
differentiating or differencing in the vertical direction of the image coordinates,
(d) a time differentiating filter for reading out the luminance data of preceding and succeeding frames
from the memory, and differentiating or differencing in a time direction,
(e) feature vector combining means for combining a luminance differential vector (Ix, Iy, It)
comprising results of the horizontal differentiating filter, the vertical differentiating filter and
the time differentiating filter, and a position vector (x, y) of the image coordinates, to obtain a
sample vector,
(f) a memory for holding, as class data, a plurality of sets of an estimation of a motion vector (u, v)
and a position vector mean $(\bar{x}, \bar{y})$ of the sample vector,
(g) nearest class determining means for determining a distance between the class data and the sample
vector from a formula $(x - \bar{x},\, y - \bar{y})$ and a formula (u·Ix + v·Iy + It), and achieving a
correspondence between the sample vector and the nearest class data, and
(h) class data changing means for changing the class data in a manner that a distance sum between the
class data and one or more sample vectors becomes smaller, from the one or more sample vectors and the
class data for which correspondence has been achieved by the nearest class determining means.
According to this constitution, the time-space differentials (Ix, Iy, It) are obtained from the memory
holding the video signals composed in frame units, by means of the horizontal differentiating filter,
the vertical differentiating filter, and the time differentiating filter. In the feature vector
combining means, the above (Ix, Iy, It) and the position vector (x, y) of the image coordinates are then
coupled to obtain a sample vector. The memory holds, as class data, a plurality of sets of the position
vector mean $(\bar{x}, \bar{y})$ of the sample vector and an estimation of the motion vector (u, v). The
nearest class determining means selects, from the plural class data in the memory, the one closest to
the sample vector on the basis of $(x - \bar{x},\, y - \bar{y})$ and (u·Ix + v·Iy + It), and thereby
determines their correspondence. As a result, the sample vector is made to correspond to class data that
is near its spatial position on the image and whose motion vector satisfies the constraint equation of
the time-space differentials. From this correspondence, the class data changing means changes the class
data, using the corresponding class data and one or more sample vectors, so that the distance sum to
those sample vectors becomes smaller.

Accordingly, the class data is changed so that its distance to the sample vectors becomes statistically
smaller, and, as a result, the class data comes to represent the motion vector (u, v) at a certain
position (x, y) in the image. In this way, by collecting plural sample vectors from a vicinity in the
image, plural constraint equations of the time-space differentials are expected to be obtained, and the
motion vector is thereby determined.

If there is a different motion near the object boundary, the distance from a sample vector on one side
to the class data of the other side becomes longer, so that the drawback of the conventional smoothing
is avoided.

BRIEF DESCRIPTION OF THE DRAWINGS 

Fig. 1 is a structural diagram of a detecting apparatus of motion vector in the prior art.
Fig. 2 is a structural diagram of a first embodiment of the detecting apparatus of motion vector.
Fig. 3 is a structural diagram of a second embodiment of the detecting apparatus of motion vector.
Fig. 4 is a principle diagram of estimation of the motion vector in the first and second embodiments of
the detecting apparatus of motion vector.

PREFERRED EMBODIMENTS 

Referring to the drawings, some embodiments of the invention are described in detail below. The first
embodiment of the invention, as claimed in claim 1, is described by reference to Figs. 2 and 4.

Fig. 2 is a structural diagram of a detecting apparatus of motion vector in the first embodiment of the
invention. Fig. 4 is a diagram showing the principle of motion estimation in the first embodiment. In
Fig. 2, numeral 101 is an A/D converter, 102 and 103 are frame memories, 104 is a horizontal direction
differentiating filter, 105 is a vertical direction differentiating filter, 106 is a time direction
differentiating filter, 107 is a horizontal direction differential image memory, 108 is a vertical
direction differential image memory, 109 is a time direction differential image memory, 110 is a
feature vector combining part, 111 is a random address generating part, 112 is a maximum likelihood
class determining part as an example of the nearest class determining means, 113 is a maximum
likelihood class data changing part, and 114 is a class data memory bank.

The image entered through the A/D converter 101 is delayed by one frame time in the frame memory 102,
and then passes on to the frame memory 103.

The horizontal direction differentiating filter 104, the vertical direction differentiating filter 105,
and the time direction differentiating filter 106 calculate the time-space differentials
(∂I/∂x, ∂I/∂y, ∂I/∂t) of formula (2) necessary for estimation of the motion vector at each pixel. The
calculation results are recorded in the horizontal direction differential image memory 107, the
vertical direction differential image memory 108, and the time direction differential image memory 109,
respectively. Since the image is sampled, the calculation of differentiation is replaced by calculation
of differences.

The random address generating part 111 generates horizontal and vertical positions on the screen at
random by using random numbers. The feature vector combining part 110, receiving the image position
(x, y), reads out the time-space differential values (Ix, Iy, It) at that position from the memories
107, 108, 109, and generates the sample vector $(I_x, I_y, I_t, x, y)^t$, where $(\cdot)^t$ denotes
transposition. On the other hand, n pieces of class data are stored in the class data memory bank 114.
In this embodiment, the class data comprises the mean s of the position vector $p = (x, y)^t$, the
covariance matrix S of the position vector, and the covariance matrix M of the luminance differential
vector $d = (I_x, I_y, I_t)^t$ in the sample vector $(I_x, I_y, I_t, x, y)^t$.

The eigenvectors of the matrix M, normalized to norm 1, are denoted e1, e2, e3 in descending order of
the corresponding eigenvalues. The initial values of the class data are set so that the positions are
equally spaced on the screen, and the covariance matrices S and M are set to unit matrices. At this
time, the eigenvector e3 of M is specifically defined as (0, 0, 1). In the maximum likelihood class
determining part 112, using the distance shown in formula (5), the sample vector combined at a random
position is made to correspond to the class data at the shortest distance from it.



$$l(p, d \mid s_i, S_i, M_i) = (p - s_i)^t S_i^{-1} (p - s_i) + \ln|S_i| + \sigma_i^{-2} (d^t e3_i)^2 + \ln \sigma_i^2 \qquad (5)$$

where

$$\sigma_i = |e3_i| / e33_i$$

and $e33_i$ is the It component of the eigenvector $e3_i$. The correspondence is not limited to the form
of formula (5). In this embodiment, each time one correspondence between a sample vector and class data
is obtained, the corresponding class data is changed sequentially in the maximum likelihood class data
changing part 113. For the k-th sample vector, let i be the identifier of its class data. Class data
other than class i is not changed, and only class i is changed according to formulas (6), (7), (8).



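$$s_i^{(k+1)} = s_i^{(k)} + \alpha \left( p - s_i^{(k)} \right), \quad 0.0 \le \alpha < 1.0 \qquad (6)$$

$$S_i^{(k+1)} = (1 - \beta)\, S_i^{(k)} + \beta \left( p - s_i^{(k)} \right) \left( p - s_i^{(k)} \right)^t, \quad 0.0 \le \beta < 1.0 \qquad (7)$$

$$M_i^{(k+1)} = (1 - \gamma)\, M_i^{(k)} + \gamma\, d d^t, \quad 0.0 \le \gamma < 1.0 \qquad (8)$$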
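One sequential step, that is, nearest class determination by formula (5) followed by the update of
formulas (6) to (8), can be sketched in Python as below. This is a minimal illustration under stated
assumptions: class data is held in a plain dictionary with hypothetical keys `s`, `S`, `M` and `e3`, and
the sketch does not recompute e3 from M after the update, which a full implementation would need to do
by an eigen decomposition.

```python
import math


def distance(p, d, cls):
    """Formula (5) for position p = (x, y) and differentials d = (Ix, Iy, It)."""
    dx, dy = p[0] - cls["s"][0], p[1] - cls["s"][1]
    (a, b), (c, e) = cls["S"]
    det = a * e - b * c
    # (p - s)^t S^-1 (p - s), with the 2 x 2 inverse written out explicitly.
    maha = (e * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    e3 = cls["e3"]
    sigma = math.sqrt(sum(t * t for t in e3)) / e3[2]
    proj = sum(di * ei for di, ei in zip(d, e3))  # d^t e3
    return maha + math.log(det) + (proj / sigma) ** 2 + math.log(sigma ** 2)


def nearest_class(p, d, classes):
    """Index of the class data at the shortest distance from the sample."""
    return min(range(len(classes)), key=lambda i: distance(p, d, classes[i]))


def update(p, d, cls, alpha, beta, gamma):
    """Formulas (6)-(8): move one class towards one sample, in place."""
    r = [p[0] - cls["s"][0], p[1] - cls["s"][1]]  # p - s^(k), used by (6) and (7)
    cls["s"] = [cls["s"][j] + alpha * r[j] for j in range(2)]
    cls["S"] = [[(1 - beta) * cls["S"][j][k] + beta * r[j] * r[k]
                 for k in range(2)] for j in range(2)]
    cls["M"] = [[(1 - gamma) * cls["M"][j][k] + gamma * d[j] * d[k]
                 for k in range(3)] for j in range(3)]
    # NOTE (assumption of this sketch): cls["e3"] is left unchanged here.
```

Only the class returned by `nearest_class` is passed to `update`, so all other class data stays
untouched, as the text requires.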



By applying the foregoing random address generation, feature vector combination, maximum likelihood
class determination, and change of the maximum likelihood class data to a number of samples,
self-organization of the class data is carried out. That is, pixels that are near one another in the
image and that satisfy a common motion constraint come to form a set belonging to the same class data.
The random address generation is intended to prevent the sequential changes of the class data from
being biased in time. Fig. 4 shows the principle of motion estimation in this embodiment. Considering
the feature space of the luminance differential vector $d = (I_x, I_y, I_t)^t$, if all data satisfy the
constraint equation of the time-space differentials, they should be distributed on one plane as shown
in the top of Fig. 4. Therefore, the first and second eigenvectors of the covariance matrix M, that is,
of the expected value of $d d^t$, span (determine) this plane. The third eigenvector is the unit normal
vector of this plane, and denoting its components (e31, e32, e33), the motion vector is obtained as
(u, v) = (e31/e33, e32/e33).
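The relation between the plane normal and the motion vector follows directly from formula (2). As a
short worked step, using only the symbols defined above:

```latex
\begin{aligned}
u I_x + v I_y + I_t = 0
  &\iff d^t \begin{pmatrix} u \\ v \\ 1 \end{pmatrix} = 0
  \quad \text{for every } d = (I_x, I_y, I_t)^t \text{ on the plane}, \\
e3 = \begin{pmatrix} e31 \\ e32 \\ e33 \end{pmatrix}
  &= \pm \frac{1}{\sqrt{u^2 + v^2 + 1}} \begin{pmatrix} u \\ v \\ 1 \end{pmatrix}
  \;\Longrightarrow\;
  (u, v) = \left( \frac{e31}{e33},\; \frac{e32}{e33} \right).
\end{aligned}
```

The ratios do not depend on the sign chosen for the unit normal, and e33 is non-zero because the normal
(u, v, 1) always has a non-zero It component.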

The change of the covariance matrix shown in formula (8) in the maximum likelihood class data changing
part 113 corresponds to the rotation of the plane shown between the top and bottom of Fig. 4.
Accordingly, the third eigenvector e3 is turned away from the sample vector, and the distance
calculated from (u·Ix + v·Iy + It) decreases for that sample vector.

It is a feature of this embodiment that the motion vector estimate is determined as an eigenvector of
the covariance matrix. Incidentally, when only e3 is to be determined, it can be approximated
sequentially by rotating the subspace expressed by e3 as shown in formula (9). In formula (9), I denotes
the 3 x 3 unit matrix.

$$e3_i^{(k+1)} = \left( I - \epsilon\, d d^t \right) e3_i^{(k)}, \quad 0.0 \le \epsilon < 1.0 \qquad (9)$$

where ε is a constant.
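The sequential approximation of formula (9) can be tried on synthetic data with the Python sketch
below. The function names `rotate_e3` and `estimate_motion`, the value ε = 0.05, the renormalization to
norm 1 after every step, and the synthetic samples are all assumptions made for this illustration.

```python
import math
import random


def rotate_e3(e3, d, eps):
    """One step of formula (9): e3 <- (I - eps * d d^t) e3, renormalized to norm 1."""
    proj = sum(di * ei for di, ei in zip(d, e3))  # d^t e3
    moved = [ei - eps * proj * di for di, ei in zip(d, e3)]
    norm = math.sqrt(sum(m * m for m in moved))
    return [m / norm for m in moved]


def estimate_motion(samples, eps=0.05):
    """Apply formula (9) to a sequence of d vectors, starting from e3 = (0, 0, 1)."""
    e3 = [0.0, 0.0, 1.0]
    for d in samples:
        e3 = rotate_e3(e3, d, eps)
    return e3[0] / e3[2], e3[1] / e3[2]  # (u, v) = (e31/e33, e32/e33)


# Synthetic check: every sample satisfies 2*Ix - 1*Iy + It = 0, i.e. (u, v) = (2, -1).
rng = random.Random(0)
samples = []
for _ in range(2000):
    gx, gy = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
    samples.append((gx, gy, -2.0 * gx + gy))
u, v = estimate_motion(samples)
```

Each step removes part of the component of e3 that lies along the sample d, so e3 drifts towards the
normal of the plane on which the samples lie. In this noise-free example the estimate approaches
(u, v) = (2, -1).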

By determining the eigenvectors of the covariance matrix, the reliability of the determined motion
vector can also be evaluated. The rank of M is ideally 2; denoting the eigenvalues λ1 > λ2 > λ3, the
reliability of the obtained motion vector is high if the value of λ2/λ3 is large. In this embodiment,
moreover, since the class data are changed sequentially, less memory is required.

The second embodiment of the invention, as claimed in claim 2, is described below by referring to
Fig. 3. Fig. 3 is a structural diagram of a detecting apparatus of motion vector in the second
embodiment.

In Fig. 3, numeral 201 is an A/D converter, 202 and 203 are frame memories, 204 is a horizontal
direction differentiating filter, 205 is a vertical direction differentiating filter, 206 is a time
direction differentiating filter, 207 is a horizontal direction differential image memory, 208 is a
vertical direction differential image memory, 209 is a time direction differential image memory, 210 is
a feature vector combining part, 211 is a random address generating part, 212 is a maximum likelihood
class determining part, 213 is a maximum likelihood class changing part, 214 is a class data memory
bank, 215 is an RGB luminance memory, and 216 is a sample stack by class. In the second embodiment, the
RGB color vector is additionally combined into the sample vector. The maximum likelihood class data is
not changed sequentially as in the first embodiment, but by repeating the following two steps.

Step 1) Pairs of sample vector and class data are stored, at least ten sets for each class.
Step 2) The class data is directly calculated from the plural sets.
Thus, the expansion of the sample vector and the step of changing the class data differ from the first
embodiment. These differences are described below.

The A/D converter 201 encodes the luminance and the RGB values, and the RGB values are stored in the
RGB luminance memory 215. The operation of the means from 202 to 209 and of 211 is the same as that of
the corresponding constituent elements in the first embodiment. The feature vector combining part 210
receives the randomly generated image position (x, y), reads out the time-space differential values
(Ix, Iy, It) and the RGB values (r, g, b), and generates the sample vector
$(I_x, I_y, I_t, x, y, r, g, b)^t$. On the other hand, n pieces of class data are stored in the class
data memory bank 214. In the second embodiment, the class data comprises the mean t of the color
position vector $q = (x, y, r, g, b)^t$, the covariance matrix Σ of the color position vector, and the
covariance matrix M of the luminance differential vector $d = (I_x, I_y, I_t)^t$ in the sample vector
$(I_x, I_y, I_t, x, y, r, g, b)^t$. As in the first embodiment, the eigenvectors of the matrix M,
normalized to norm 1, are denoted e1, e2, e3 in descending order of the corresponding eigenvalues. The
initial values of the class data are set so that the positions are equally spaced on the screen, and
the covariance matrices Σ and M are set to unit matrices. At this time, the eigenvector e3 of M is
specifically defined as (0, 0, 1).

In the maximum likelihood class determining part 212, using the distance expressed in formula (10), the
sample vector combined at a random position is made to correspond to the class data at the minimum
distance from it.

$$l(q, d \mid t_i, \Sigma_i, M_i) = (q - t_i)^t \Sigma_i^{-1} (q - t_i) + \ln|\Sigma_i| + \sigma_i^{-2} (d^t e3_i)^2 + \ln \sigma_i^2 \qquad (10)$$

where

$$\sigma_i = |e3_i| / e33_i$$



The covariance matrix Σ is restricted so that the covariance C about color and the covariance S about
position are independent, as shown in formula (11).

$$\Sigma_i = \begin{pmatrix} C_i & 0 \\ 0 & S_i \end{pmatrix} \qquad (11)$$
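Because the covariance is block-diagonal, the first two terms of formula (10) separate into a position
part and a color part. As a short worked step, writing c = (r, g, b)^t for the color part of q and
$\bar{c}_i$ for the color part of the mean t_i (both symbols are introduced here only for
illustration), and s_i for its position part:

```latex
\begin{aligned}
(q - t_i)^t\, \Sigma_i^{-1} (q - t_i) + \ln|\Sigma_i|
  &= (p - s_i)^t S_i^{-1} (p - s_i) + \ln|S_i| \\
  &\quad + (c - \bar{c}_i)^t C_i^{-1} (c - \bar{c}_i) + \ln|C_i| .
\end{aligned}
```

The first line is the position term of formula (5), so formula (10) is formula (5) with an added color
term. This holds for any block-diagonal covariance, whichever order the position and color components
take in q.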



In this embodiment, the correspondence between sample vector and class data is carried out a specified
number of times in the maximum likelihood class determining part 212 (for example, about 100 times the
number of classes n), and the corresponding sample vectors are stored in the sample stack by class 216
under each class identifier. Thereafter, on the basis of the samples stored in the sample stack by
class 216, the operations of formulas (12), (13), (14) are performed by the maximum likelihood class
changing part 213, and the mean $t_i^{(l)}$ of the color position vector, its covariance matrix
$\Sigma_i^{(l)}$, and the covariance matrix $M_i^{(l)}$ of the luminance differential vector are updated
to $t_i^{(l+1)}$, $\Sigma_i^{(l+1)}$, and $M_i^{(l+1)}$.

In formulas (12), (13), (14), $m_i$ is the number of samples in the stack for class i, and l denotes
the number of times of updating.
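Formulas (12), (13), (14) are not reproduced in this text, so the following Python sketch is only one
plausible reading of "directly calculated from the plural sets". It assumes that t is the sample mean
of q, that Σ is the sample covariance of q about the new mean, and that M is the average of d d^t,
matching the d d^t term of formula (8). The function names are hypothetical, and the block-diagonal
restriction of formula (11) is not enforced here.

```python
def mean_vector(vectors):
    """Component-wise mean of a list of equally long vectors."""
    n = len(vectors)
    return [sum(vec[j] for vec in vectors) / n for j in range(len(vectors[0]))]


def scatter_matrix(vectors, centre):
    """Average of (x - centre)(x - centre)^t over the given vectors."""
    n, dim = len(vectors), len(centre)
    return [[sum((vec[j] - centre[j]) * (vec[k] - centre[k]) for vec in vectors) / n
             for k in range(dim)] for j in range(dim)]


def batch_update(stack):
    """Recompute one class from its stack of (q, d) pairs.

    ASSUMPTION: sample mean and sample covariance stand in for the
    formulas (12)-(14), which are missing from this text.
    """
    qs = [q for q, _ in stack]
    ds = [d for _, d in stack]
    t = mean_vector(qs)                      # mean of the color position vector
    sigma = scatter_matrix(qs, t)            # covariance of q about the new mean
    m = scatter_matrix(ds, [0.0, 0.0, 0.0])  # average of d d^t
    return t, sigma, m
```

Here `len(stack)` plays the role of m_i, and calling `batch_update` once per class after refilling
the stacks corresponds to one increment of l.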

By repeating this update more than about ten times (l > 10), the sample vectors are clustered. By
combining the color information, separation of different objects is improved, and the precision of
motion vector detection within each region is enhanced. Region segmentation combining color and motion
is thus performed. In this embodiment, random address generation is not essential, but it can prevent
biased estimation if the number of samples in the sample stack by class exceeds the storage limit.

According to the invention, as compared with the conventional motion vector detecting apparatus based
on the time-space differential method, the following effects are obtained.

(1) Repetitive operations are not necessary, and the processing time is shorter.

(2) The region can be segmented by combination of color and other information.

(3) Large estimation errors do not occur at region boundaries.

Hence, the invention can be applied to image coding, noise reduction, and object tracking, and its
practical effects are great.

Claims 

1. A detecting apparatus of motion vector comprising:

(a) a memory (102, 103) for holding video signals which are coded and composed of frame units,

(b) a horizontal differentiating filter (104) for reading out luminance data from the memory and
differentiating (executing a differential method) or differencing (executing a difference method) in a
horizontal direction of arbitrary image coordinates,

(c) a vertical differentiating filter (105) for reading out luminance data from the memory and
differentiating or differencing in the vertical direction of the image coordinates,

(d) a time differentiating filter (106) for reading out the luminance data of preceding and succeeding
frames from the memory, and differentiating or differencing in a time direction,

(e) feature vector combining means (110) for combining luminance differential vector (Ix, Iy, It)
comprising results of the horizontal differentiating filter, the vertical differentiating filter and
the time differentiating filter, and position vector (x, y) of the image coordinates, to obtain a
sample vector,

(f) a memory (114) for holding a plurality of an estimation of a motion vector (u, v) and a position
vector mean $(\bar{x}, \bar{y})$ of the sample vector as class data,

(g) nearest class determining means (112) for determining a distance between the class data and the
sample vector from a formula $(x - \bar{x},\, y - \bar{y})$ and a formula (u·Ix + v·Iy + It), and
achieving a correspondence between the sample vector and nearest class data, and

(h) class data changing means (113) for changing the class data in a manner that a distance sum
between the class data and one or more sample vectors becomes smaller, from one or more sample vectors
and the class data of which correspondence has been achieved by the nearest class determining means.

2. A detecting apparatus of motion vector comprising:

(a) a memory for holding video signals which are coded and composed of frame units,

(b) a horizontal differentiating filter for reading out luminance data from the memory and
differentiating (executing a differential method) or differencing (executing a difference method) in a
horizontal direction of arbitrary image coordinates,

(c) a vertical differentiating filter for reading out luminance data from the memory and
differentiating or differencing in the vertical direction of the image coordinates,

(d) a time differentiating filter for reading out the luminance data of preceding and succeeding
frames from the memory, and differentiating or differencing in a time direction,

(e) sample vector combining means for reading out color data consisting of an array of one or more
luminance values differing in color from the memory, combining (a) the color data, (b) luminance
differential vector (Ix, Iy, It) comprising results of the horizontal differentiating filter, the
vertical differentiating filter and the time differentiating filter, and (c) position vector (x, y) of
the image coordinates, to obtain a sample vector,

(f) a memory for holding a plurality of a position vector mean $(\bar{x}, \bar{y})$, an estimation of a
motion vector (u, v), and a mean of the color data of the sample vector, as class data,

(g) nearest class determining means for determining a distance between the class data and the sample
vector from a formula of $(x - \bar{x},\, y - \bar{y})$, a formula of (u·Ix + v·Iy + It) and a
difference between the color data and color data mean, and achieving a correspondence between the
sample vector and nearest class data, and

(h) class data changing means for changing the class data in a manner that a distance sum between the
class data and one or more sample vectors becomes smaller, from one or more sample vectors and the
class data of which correspondence has been achieved by the nearest class determining means.
Fig. 1 (PRIOR ART)
Fig. 2
Fig. 3